113 research outputs found

    On Complexity, Energy- and Implementation-Efficiency of Channel Decoders

    Future wireless communication systems require efficient and flexible baseband receivers. Meaningful efficiency metrics are key for design space exploration to quantify the algorithmic and the implementation complexity of a receiver. Most currently established efficiency metrics are based on counting operations and thus neglect important issues such as data and storage complexity. In this paper we introduce suitable energy and area efficiency metrics that resolve the aforementioned disadvantages: decoded information bits per unit of energy and throughput per unit of area. The efficiency metrics are assessed for various implementations of turbo, LDPC, and convolutional decoders. New exploration methodologies are presented that permit appropriate benchmarking of implementation efficiency, communications performance, and flexibility trade-offs. These methodologies are based on efficiency trajectories rather than the single snapshot metrics used in state-of-the-art approaches.
    Comment: Submitted to IEEE Transactions on Communications
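
    As an illustration of how such metrics could be computed from implementation data, the short sketch below evaluates both quantities for a hypothetical decoder; the numbers are placeholders, not results from the paper.

        def energy_efficiency(throughput_bps, power_w):
            """Decoded information bits per unit of energy (bit/J, i.e. bit/s per W)."""
            return throughput_bps / power_w

        def area_efficiency(throughput_bps, area_mm2):
            """Throughput per unit of silicon area (bit/s per mm^2)."""
            return throughput_bps / area_mm2

        # Hypothetical decoder implementation: 1 Gbit/s at 300 mW on 0.5 mm^2
        print(energy_efficiency(1e9, 0.3))   # ~3.3e9 bit/J
        print(area_efficiency(1e9, 0.5))     # 2e9 bit/(s*mm^2)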

    TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization

    Recent research implies that training and inference of deep neural networks (DNNs) can be computed with low-precision numerical representations of the training/test data, weights and gradients without a general loss in accuracy. The benefit of such compact representations is twofold: they allow a significant reduction of the communication bottleneck in distributed DNN training and faster neural network implementations on hardware accelerators like FPGAs. Several quantization methods have been proposed to map the original 32-bit floating-point problem to low-bit representations. While most related publications validate the proposed approach on a single DNN topology, it appears evident that the optimal choice of quantization method and number of coding bits is topology dependent. To date, there is no general theory available that would allow users to derive the optimal quantization during the design of a DNN topology. In this paper, we present a quantization toolbox for the TensorFlow framework. TensorQuant allows a transparent quantization simulation of existing DNN topologies during training and inference. TensorQuant supports generic quantization methods and allows experimental evaluation of the impact of quantization on single layers as well as on the full topology. In a first series of experiments with TensorQuant, we present an analysis of fixed-point quantization of popular CNN topologies.
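
    For a rough picture of what simulating fixed-point quantization means, the following is a minimal NumPy sketch of rounding and saturating values to a signed fixed-point grid; it is not the TensorQuant API, and the bit widths are arbitrary.

        import numpy as np

        def quantize_fixed_point(x, int_bits=2, frac_bits=6):
            """Simulate fixed-point quantization: round to a grid of 2^-frac_bits
            and saturate to the range of a signed Q(int_bits).(frac_bits) format."""
            scale = 2.0 ** frac_bits
            max_val = 2.0 ** int_bits - 1.0 / scale
            q = np.round(x * scale) / scale          # round to nearest grid point
            return np.clip(q, -2.0 ** int_bits, max_val)

        weights = np.random.randn(256, 256).astype(np.float32)
        q_weights = quantize_fixed_point(weights)
        print(np.max(np.abs(weights - q_weights)))  # worst-case quantization error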

    A Reconfigurable Outer Modem Platform for Future Communications Systems

    Future mobile and wireless communications networks require flexible modem architectures with high performance. Efficient utilization of application-specific flexibility is key to fulfilling these requirements. For high throughput, a single processor cannot provide the necessary computational power; hence, multi-processor architectures become necessary. This paper presents a multi-processor platform based on a new dynamically reconfigurable application-specific instruction-set processor (dr-ASIP) for the application domain of channel decoding. Inherently parallel decoding tasks can be mapped onto individual processing nodes. The resulting challenging inter-processor communication is handled efficiently by a Network-on-Chip (NoC) so that the throughput of each node is not degraded. The dr-ASIP features Viterbi and Log-MAP decoding to support convolutional and turbo codes from more than 10 currently specified mobile and wireless standards. Furthermore, its flexibility allows for adaptation to future systems.
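
    To indicate the kind of arithmetic such a processor accelerates, the sketch below shows the max-star operation at the heart of Log-MAP decoding and its max-log simplification used in Viterbi-style processing; this is purely illustrative and not part of the dr-ASIP instruction set.

        import math

        def max_star(a, b):
            """Log-MAP core operation: ln(e^a + e^b) = max(a, b) + ln(1 + e^-|a-b|)."""
            return max(a, b) + math.log1p(math.exp(-abs(a - b)))

        def max_log(a, b):
            """Max-Log-MAP / Viterbi approximation: drop the correction term."""
            return max(a, b)

        print(max_star(1.0, 1.2))  # ~1.798
        print(max_log(1.0, 1.2))   # 1.2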

    Efficient Hardware Implementation of Constant Time Sampling for HQC

    HQC is one of the code-based finalists in the last round of the NIST post-quantum cryptography standardization process. In this process, security and implementation efficiency are key metrics for the selection of the candidates. A critical compute kernel with respect to efficient hardware implementation and security in HQC is the sampling method used to derive random numbers. Due to its security criticality, an updated sampling algorithm was recently presented to increase its robustness against side-channel attacks. In this paper, we pursue a cross-layer approach to optimize this new sampling algorithm for an efficient hardware implementation without compromising the original algorithmic security and side-channel robustness. We compare our cross-layer-based implementation to a direct hardware implementation of the original algorithm and to optimized implementations of the previous sampler version. All implementations are evaluated on a Xilinx Artix-7 FPGA. Our results show that our approach reduces latency by a factor of 24 compared to the original algorithm and by a factor of 28 compared to the previously used sampler, while requiring significantly fewer resources.
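
    For intuition, a fixed-weight sampler of this kind can be sketched as a partial Fisher-Yates shuffle; this is a hypothetical illustration, not the HQC specification or the hardware design from the paper, and true constant-time execution is not achievable in plain Python.

        import secrets

        def sample_fixed_weight_support(n, w):
            """Illustrative rejection-free sampling of w distinct positions in [0, n)
            via a partial Fisher-Yates shuffle. A real HQC implementation additionally
            needs constant-time comparisons and bias-free modular reduction, which
            plain Python cannot guarantee."""
            pos = list(range(n))
            for i in range(w):
                j = i + secrets.randbelow(n - i)   # pick from the not-yet-fixed tail
                pos[i], pos[j] = pos[j], pos[i]
            return sorted(pos[:w])

        # Illustrative parameters in the range used by HQC-128
        print(sample_fixed_weight_support(17669, 66)[:10])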

    A Hybrid Approach combining ANN-based and Conventional Demapping in Communication for Efficient FPGA-Implementation

    In communication systems, the autoencoder (AE) refers to the concept of replacing parts of the transmitter and receiver with artificial neural networks (ANNs) in order to train the system end-to-end over a channel model. This approach aims to improve communication performance, especially under varying channel conditions, at the cost of high computational complexity for training and inference. Field-programmable gate arrays (FPGAs) have been shown to be a suitable platform for energy-efficient ANN implementation. However, the high number of operations and the large model size of ANNs limit the performance on resource-constrained devices, which is critical for low-latency and high-throughput communication systems. To tackle this challenge, we propose a novel approach for efficient ANN-based demapping on FPGAs that combines the adaptability of the AE with the efficiency of conventional demapping algorithms. After adaptation to the channel conditions, the channel characteristics implicitly learned by the ANN are extracted to enable the use of optimized conventional demapping algorithms for inference. We validate the hardware efficiency of our approach by providing FPGA implementation results and by comparing the communication performance to that of conventional systems. Our work opens the door to the practical application of ANN-based communication algorithms on FPGAs.
    Comment: Available at: https://ieeexplore.ieee.org/document/983569
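
    One way to picture "conventional demapping after extraction": once the effective constellation is known, a standard max-log demapper computes per-bit LLRs from distances alone. The sketch below assumes Gray-labeled 4-QAM and is a generic illustration, not the implementation from the paper.

        import numpy as np

        def max_log_llr(y, constellation, bit_labels, noise_var):
            """Conventional max-log demapping: per-bit LLR from the nearest
            constellation points with that bit equal to 0 and 1, respectively.
            `constellation` could hold points extracted from a trained AE."""
            d2 = np.abs(y - constellation) ** 2          # squared distances
            llrs = []
            for b in range(bit_labels.shape[1]):
                d0 = np.min(d2[bit_labels[:, b] == 0])
                d1 = np.min(d2[bit_labels[:, b] == 1])
                llrs.append((d1 - d0) / noise_var)
            return np.array(llrs)

        # 4-QAM example with Gray labeling (illustrative values)
        const = np.array([1+1j, -1+1j, 1-1j, -1-1j]) / np.sqrt(2)
        labels = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
        print(max_log_llr(0.8 + 0.3j, const, labels, noise_var=0.5))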

    Advanced Wireless Digital Baseband Signal Processing Beyond 100 Gbit/s

    The continuing trend towards higher data rates in wireless communication systems will, in addition to higher spectral efficiency and lower signal-processing latencies, lead to throughput requirements for digital baseband signal processing beyond 100 Gbit/s, at least one order of magnitude higher than the tens of Gbit/s targeted in 5G standardization. At the same time, advances in silicon technology from shrinking feature sizes and improved performance parameters alone will not provide the necessary gains, especially in energy efficiency, for wireless transceivers with tightly constrained power and energy budgets. In this paper, we highlight the challenges for wireless digital baseband signal processing beyond 100 Gbit/s and the limitations of today's architectures. Our focus lies on channel decoding and MIMO detection, which are major sources of complexity in digital baseband signal processing. We discuss techniques at the algorithmic and architectural level that aim to close this gap. For the first time, we show turbo-code decoding techniques towards 100 Gbit/s and a complete MIMO receiver beyond 100 Gbit/s in 28 nm technology.
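
    A simple back-of-envelope calculation shows why architectural parallelism, rather than clock frequency alone, has to close this gap; the numbers below are assumed for illustration and are not figures from the paper.

        # Back-of-envelope parallelism requirement for a 100 Gbit/s decoder
        # (hypothetical numbers, not results from the paper).
        target_bps = 100e9                # target information throughput
        f_clk = 800e6                     # assumed clock frequency in 28 nm
        bits_per_cycle_per_core = 1       # e.g., one decoded bit per cycle per core

        required_parallelism = target_bps / (f_clk * bits_per_cycle_per_core)
        print(required_parallelism)       # 125 parallel decoder instances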

    Unsupervised ANN-Based Equalizer and Its Trainable FPGA Implementation

    In recent years, communication engineers have put strong emphasis on artificial neural network (ANN)-based algorithms with the aim of increasing the flexibility and autonomy of the system and its components. In this context, unsupervised training is of special interest as it enables adaptation without the overhead of transmitting pilot symbols. In this work, we present a novel ANN-based, unsupervised equalizer and its trainable field-programmable gate array (FPGA) implementation. We demonstrate that our custom loss function allows the ANN to adapt to varying channel conditions, approaching the performance of a supervised baseline. Furthermore, as a first step towards a practical communication system, we design an efficient FPGA implementation of our proposed algorithm, which achieves a throughput on the order of Gbit/s, outperforming a high-performance GPU by a large margin.
    Comment: Accepted for publication at the Joint European Conference on Networks and Communications & 6G Summit (EuCNC/6G Summit), Gothenburg, Sweden, 6 - 9 June 202
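
    To make the idea of pilot-free adaptation concrete, one commonly used unsupervised criterion is a decision-directed loss, sketched below; this is a generic illustration and differs from the custom loss function proposed in the paper.

        import numpy as np

        def decision_directed_loss(eq_out, constellation):
            """Unsupervised surrogate loss: mean squared distance of each equalizer
            output to its nearest constellation point (no pilot symbols needed).
            Generic decision-directed criterion, not the paper's custom loss."""
            d2 = np.abs(eq_out[:, None] - constellation[None, :]) ** 2
            return np.mean(np.min(d2, axis=1))

        const = np.array([1+1j, -1+1j, 1-1j, -1-1j]) / np.sqrt(2)
        rx = const[np.random.randint(0, 4, 1000)] + 0.1 * (
            np.random.randn(1000) + 1j * np.random.randn(1000))
        print(decision_directed_loss(rx, const))   # small for a well-equalized signal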